Dynamically weighted clustering with noise set

Authors

  • Yijing Shen
  • Wei Sun
  • Ker-Chau Li
Abstract

MOTIVATION: Various clustering methods have been applied to microarray gene expression data to identify genes with similar expression profiles. As biological annotation data have accumulated, more and more genes have been organized into functional categories. Functionally related genes may be regulated by common cellular signals and are thus likely to be co-expressed. Consequently, there is great interest in using rapidly growing functional annotation resources such as Gene Ontology (GO) to improve the performance of clustering methods. At the opposite extreme, some genes have distinct expression profiles and are not co-expressed with other genes. Identifying these scattered genes could also enhance the performance of clustering methods.

RESULTS: We developed a new clustering algorithm, Dynamically Weighted Clustering with Noise set (DWCN), which makes use of gene annotation information and allows a set of scattered genes, the noise set, to be left out of the main clusters. We tested the DWCN method and compared its results with those of several common clustering techniques on a simulated dataset and on two public datasets: the Stanford yeast cell-cycle gene expression data and a gene expression dataset for a group of genetically different yeast segregants.

CONCLUSION: Our method produces clusters with more consistent functional annotations and more coherent expression patterns than existing clustering techniques.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Similar resources

Bilateral Weighted Fuzzy C-Means Clustering

The Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...

Sample-weighted clustering methods

Keywords: cluster analysis; maximum entropy principle; k-means; fuzzy c-means; sample weights; robustness. Abstract: Although there has been much research on cluster analysis that considers feature (or variable) weights, little effort has been devoted to sample weights in clustering. In practice, not every sample in a data set has the same importance in cluster analysis. Therefore, it is in...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to attribute weights is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that ECG analysis should use algorithms that can effectively handle the nonstationarity and uncertainty of the signals. Ex...

An adaptive dynamically weighted median filter for impulse noise removal

A new impulsive noise removal filter, the adaptive dynamically weighted median filter (ADWMF), is proposed. The median filter is a popular method for removing impulsive noise; the weighted median filter and the center-weighted median filter have also been investigated. ADWMF is based on the weighted median filter. In ADWMF, instead of fixed weights, the weights of the filter are dynamically assigned with th...

Image Segmentation Using FELICM Clustering Method

Clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. Various clustering algorithms have been developed, but they ignore the spatial relationship between pixel values, so noise added to the image degrades the result, and they do not provide accurate edge detection. Fuzzy local information C-means is the best image clustering m...

Journal title:

Volume 26, Issue

Pages -

Publication date: 2010